Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis.
Abstract
Gene Set Enrichment Analysis (GSEA) is a basic tool for the analysis of genomic data. Its test statistic is based on a cumulated weight function, and its distribution under the null hypothesis is evaluated by Monte Carlo simulation. Here, it is proposed to subtract from the cumulated weight function its asymptotic expectation, then scale the result. Under the null hypothesis, the convergence in distribution of the new test statistic is proved using the theory of empirical processes. The limiting distribution needs to be computed only once and can then be used for many different gene sets, which results in large savings in computing time. The test defined in this way is called the Weighted Kolmogorov Smirnov (WKS) test. The classical GSEA test and the new procedure were compared using expression data from the GEO repository, tested against the MSig Database C2. Our conclusion is that, beyond its mathematical and algorithmic advantages, the WKS test could in many cases be more informative than the classical GSEA test.
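The idea of centering and scaling the cumulated weight function can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`wks_statistic`, `simulate_null`, `p_value`), the use of the exact finite-sample expectation under a uniformly random gene set as a stand-in for the asymptotic expectation, and the simulation of the null distribution at a single reference set size are all assumptions made for illustration.

```python
import numpy as np


def wks_statistic(ranks, weights):
    """Centered, scaled cumulated-weight statistic (illustrative sketch).

    ranks   : 0-based positions of the gene-set members in the ranked gene list
    weights : non-negative weights of all N genes, in ranked order
    """
    weights = np.asarray(weights, dtype=float)
    ranks = np.asarray(ranks, dtype=int)
    n_genes = weights.size
    set_size = ranks.size

    # Indicator of gene-set membership along the ranked list.
    member = np.zeros(n_genes)
    member[ranks] = 1.0

    # Cumulated weight function of the gene set, averaged over its members.
    cumulated = np.cumsum(weights * member) / set_size

    # Expectation when the set is drawn uniformly at random (assumption:
    # used here in place of the asymptotic expectation).
    expected = np.cumsum(weights) / n_genes

    # Center, take the supremum of the absolute deviation, scale by sqrt(n).
    return np.sqrt(set_size) * np.max(np.abs(cumulated - expected))


def simulate_null(weights, set_size, n_sim=500, seed=0):
    """Simulate the null distribution once, for one reference set size."""
    rng = np.random.default_rng(seed)
    n_genes = len(weights)
    stats = [
        wks_statistic(rng.choice(n_genes, size=set_size, replace=False), weights)
        for _ in range(n_sim)
    ]
    return np.sort(stats)


def p_value(stat, null_sorted):
    """Upper-tail Monte Carlo p-value against a sorted null sample."""
    n_below = np.searchsorted(null_sorted, stat, side="left")
    n_at_least = null_sorted.size - n_below
    return (n_at_least + 1) / (null_sorted.size + 1)


# Hypothetical data: 1000 ranked genes with weights largest at both ends.
weights = np.abs(np.linspace(1.0, -1.0, 1000))
null_sorted = simulate_null(weights, set_size=50)

# The same simulated null is reused for gene sets of different sizes.
enriched_set = np.arange(50)        # concentrated at the top of the list
other_set = np.arange(0, 1000, 25)  # 40 genes spread evenly along the list
p_enriched = p_value(wks_statistic(enriched_set, weights), null_sorted)
p_other = p_value(wks_statistic(other_set, weights), null_sorted)
```

Because the statistic is scaled by the square root of the set size, the sketch compares sets of 50 and 40 genes against one simulated null sample instead of running a separate simulation per gene set, which is where the savings in computing time would come from.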
Similar articles
Approximations for weighted Kolmogorov-Smirnov distributions via boundary crossing probabilities
A statistical application to Gene Set Enrichment Analysis requires calculating the distribution of the maximum of a certain Gaussian process, which is a modification of the standard Brownian bridge. Using the transformation into a boundary crossing problem for the Brownian motion and a piecewise linear boundary, it is proved that the desired distribution can be approximated by an n-dimensional Ga...
dslice: an R package for nonparametric testing of associations with application in QTL and gene set analysis
Many statistical problems in bioinformatics and genetics can be formulated as the testing of associations between a categorical variable and a continuous variable. A dynamic slicing method was proposed for non-parametric dependence testing, which has been demonstrated to have higher power than traditional methods such as the Kolmogorov-Smirnov test. We introduce an R package ds...
The XL-mHG test for gene set enrichment
The nonparametric minimum hypergeometric (mHG) test is a popular alternative to Kolmogorov-Smirnov (KS)-type tests for determining gene set enrichment. However, these approaches have not been compared to each other in a quantitative manner. Here, I first perform a simulation study to show that the mHG test is significantly more powerful than the one-sided KS test for detecting gene set enrich...
Dual KS: Defining Gene Sets with Tissue Set Enrichment Analysis
Background: Gene set enrichment analysis (GSEA) is an analytic approach that simultaneously reduces the dimensionality of microarray data and enables ready inference of the biological meaning of observed gene expression patterns. Here we invert the GSEA process to identify class-specific gene signatures. Because our approach uses the Kolmogorov-Smirnov approach both to define class specific sig...
Weighted Kolmogorov-Smirnov test: accounting for the tails.
Accurate goodness-of-fit testing for the extreme tails of empirical distributions is a very important issue, relevant in many contexts, including geophysics, insurance, and finance. We have derived exact asymptotic results for a generalization of the large-sample Kolmogorov-Smirnov test, well suited to testing these extreme tails. In passing, we have rederived and made more precise the approximat...
Journal: Statistical Applications in Genetics and Molecular Biology
Volume 14, Issue 3
Pages: -
Publication date: 2015